Dyck language

In the theory of formal languages in computer science, mathematics, and linguistics, the Dyck language is the language consisting of balanced strings of the brackets [ and ]. It is important in the parsing of expressions that must have a correctly nested sequence of parentheses, such as arithmetic or algebraic expressions. It is named after the mathematician Walther von Dyck.

Formal definition

Let Σ = { [, ] } be the alphabet consisting of the symbols [ and ], and let Σ* denote its Kleene closure. For any element u ∈ Σ* with length |u| we define partial functions insert : Σ* × (N ∪ {0}) → Σ* and delete : Σ* × N → Σ* by

insert(u, j) = u with "[]" inserted into the jth position
delete(u, j) = u with "[]" deleted from the jth position

with the understanding that insert(u, j) is undefined for j > |u| and delete(u, j) is undefined if j > |u| − 2. We define an equivalence relation R on Σ* as follows: for elements a, b ∈ Σ* we have (a, b) ∈ R if and only if there exists a finite sequence of applications of the insert and delete functions starting with a and ending with b, where the empty sequence is allowed. Allowing the empty sequence accounts for the reflexivity of R. Symmetry follows from the observation that any finite sequence of applications of insert to a string can be undone with a finite sequence of applications of delete, and vice versa. Transitivity holds because a sequence leading from a to b followed by a sequence leading from b to c is a sequence leading from a to c.
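The two partial functions can be sketched in Python as follows. This is only an illustration under stated assumptions: j is read as a 0-based offset (the number of characters preceding the affected position) for both functions, "undefined" is modelled by returning None, and delete is additionally treated as undefined when the two characters at offset j are not "[]". The constant name PAIR is introduced here for illustration only.

```python
from typing import Optional

PAIR = "[]"  # the only substring that insert adds and delete removes


def insert(u: str, j: int) -> Optional[str]:
    """Return u with "[]" inserted at offset j, or None where undefined."""
    # Undefined for j > |u| (and, by assumption, for negative j).
    if j < 0 or j > len(u):
        return None
    return u[:j] + PAIR + u[j:]


def delete(u: str, j: int) -> Optional[str]:
    """Return u with the "[]" at offset j removed, or None where undefined."""
    # Undefined for j > |u| - 2; also assumed undefined when the two
    # characters starting at offset j are not exactly "[]".
    if j < 0 or j > len(u) - 2 or u[j:j + 2] != PAIR:
        return None
    return u[:j] + u[j + 2:]
```

Under these assumptions, delete(insert(u, j), j) returns u whenever insert(u, j) is defined, which is the observation behind the symmetry of R. For example, insert("[]", 1) gives "[[]]", and delete("[[]]", 1) gives "[]" again.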

The equivalence relation partitions the language Σ* into equivalence classes. If we take ε to denote the empty string, then the language corresponding to the equivalence class Cl(ε) is called the Dyck language.
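Membership in Cl(ε) can be tested directly, as in the sketch below. It relies on the fact that inserting or deleting "[]" never changes whether a string is balanced, and that every nonempty balanced string contains "[]" as a substring, so a string lies in Cl(ε) exactly when repeated deletion of "[]" reduces it to the empty string. The function names are hypothetical, and the second function is the usual depth-counter check, included as an equivalent linear-time alternative rather than as part of the definition above.

```python
def in_dyck_by_reduction(s: str) -> bool:
    """Repeatedly delete one occurrence of "[]" and test for the empty string."""
    while "[]" in s:
        s = s.replace("[]", "", 1)  # remove the leftmost "[]" only
    return s == ""


def in_dyck_by_counter(s: str) -> bool:
    """Scan left to right, tracking the number of currently open brackets."""
    depth = 0
    for ch in s:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:  # a "]" with no matching "[" before it
                return False
        else:
            return False  # symbol outside the alphabet { [, ] }
    return depth == 0  # every "[" must have been closed
```

Both functions accept "", "[]", "[[]]" and "[][]", and both reject "][", "[" and "[]]". The reduction version follows the definition more closely but can take quadratic time, while the counter version makes a single pass.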

Properties
